APPARATUS AND METHOD FOR EMBEDDING WATERMARK
INFORMATION, APPARATUS AND METHOD FOR DETECTING
WATERMARK INFORMATION, AND DOCUMENT CONTAINING
WATERMARK

FIELD OF THE INVENTION 

The invention relates to a method and apparatus for embedding
watermark information in document images; to an associated method and
apparatus for detecting such watermark information from document
images in which it has been embedded by the embedding method and
apparatus; and to a document containing a watermark.

BACKGROUND OF THE INVENTION 

A digital watermark, traditionally associated with a document, is
embedded to prevent the document from being copied or forged, usually in
a manner that does not alert human viewers to its presence.
Meta-information embedded as a digital watermark is stored and
transmitted together with the digital media and is not easily degraded or
erased. Such meta-information can therefore be detected robustly and
reliably. Similarly, methods and systems are needed for verifying the
authenticity of paper media, such as documents and articles. A
confidential structure characteristic of the meta-information should be
embedded inconspicuously in the document or article in order to provide
reasonable security against forgery.

Conventionally, as disclosed in Official Gazette of Japanese Patent
Application Laid-Open No. 09-179494, confidential information to be
recorded is binarized into blocks. The confidential information is denoted
by the distances (in pixels) between reference point marks and position
discrimination marks.

However, in the above-mentioned conventional technique, the image
input by a scanner or other input apparatus must be processed accurately
at the granularity of a single pixel during detection. Any spot on the
paper, or any noise introduced during printing or reading, therefore has
a great influence on the detection of the information.

Moreover, in the above-mentioned conventional technique, when the
printed document is scanned into a computer by a scanner or other input
apparatus and the confidential information is detected, the input image
contains many noise components because of spots introduced on the
printed document during printing and rotation distortion introduced
during scanning, so that the confidential information can hardly be read
correctly.

BRIEF SUMMARY OF THE INVENTION 

The invention is made in consideration of the above problems and
provides the following preferable configurations.

In accordance with a first aspect of the invention, a watermark
information embedding apparatus comprises: a document image generating
section for generating a document image; a watermark image generating
section which uses dot patterns to denote watermark information and
generates a watermark image in which the outline of the recording area of
the watermark information is denoted by dot patterns indicating a special
value; and a synthesizing section for overlapping the document image and
the watermark image so as to generate a containing watermark document
image.

In accordance with a second aspect of the invention, a watermark
information embedding apparatus comprises a document image generating
section for generating a document image, a PN code generating section for
generating a PN code, a watermark image generating section for diffusing
prescribed watermark information by using the PN code to obtain diffused
watermark information and generating a watermark image in which the
diffused watermark information is denoted by dot patterns, and a
synthesizing section for overlapping the document image and the watermark
image so as to generate a containing watermark document image.

Preferably, in accordance with the second aspect of the invention, the PN
code generating section generates at least one PN code, and the watermark
image generating section uses the at least one PN code to diffuse the
prescribed watermark information in units of rows or in units of columns.

Preferably, the PN code generating section generates a two-dimensional
PN code whose PN codes for the row direction and the column direction are
either different from or the same as each other.

In accordance with a third aspect of the invention, a watermark
information embedding apparatus comprises a document image generating
section for generating a multipage document image, a PN code generating
section for generating a three-dimensional PN code whose PN codes for the
row direction, the column direction and the page direction are either
different from or the same as one another, a watermark image generating
section for generating a multipage watermark image, and a synthesizing
section for overlapping the multipage document image and the
corresponding watermark image so as to generate a containing watermark
document image. The PN code generating section generates a
two-dimensional PN code composed of PN codes for the row direction and
the column direction according to the prescribed watermark information.
The watermark image generating section uses the two-dimensional PN code
to diffuse the prescribed watermark information so as to generate the
watermark image of one page, and uses the PN code in the page direction
to diffuse it further so as to generate the multipage watermark image.

Preferably, the multiple dot patterns are arranged on one surface, and
at least one of the dot patterns represents special watermark
information.

In accordance with a fourth aspect of the invention, a watermark
information detecting apparatus, for extracting from a document watermark
information represented as a watermark image having multiple dot patterns
arranged on one surface thereof, comprises a watermark information
detector. The watermark information detector identifies the area of the
watermark information according to a detected outline representing a
special value.

In accordance with a fifth aspect of the invention, a watermark
information detecting apparatus, for extracting from a document watermark
information which is diffused by a PN code and represented as a watermark
image, comprises a watermark detector. The watermark detector extracts
the watermark image from the document, and estimates the area of the
watermark information by calculating the correlation between the
watermark image and the PN code.
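
By way of illustration only, the idea of diffusing information with a PN code and recovering it from a correlation peak can be sketched as follows. The length-7 sequence, the +1/-1 chip representation and the function names `spread`, `correlate` and `despread` are assumptions made for this sketch; they are not taken from the embodiments.

```python
# Illustrative length-7 PN sequence in +1/-1 form (an assumption for this sketch).
PN = [1, 1, 1, -1, 1, -1, -1]

def spread(bits, pn=PN):
    """Diffuse each bit over len(pn) chips: bit 1 -> +pn, bit 0 -> -pn."""
    chips = []
    for bit in bits:
        sign = 1 if bit else -1
        chips.extend(sign * c for c in pn)
    return chips

def correlate(chips, pn=PN):
    """Correlate each block of len(pn) chips with the PN code."""
    n = len(pn)
    return [sum(chips[i + j] * pn[j] for j in range(n))
            for i in range(0, len(chips) - n + 1, n)]

def despread(chips, pn=PN):
    """Recover bits from the sign of each correlation value."""
    return [1 if value > 0 else 0 for value in correlate(chips, pn)]

chips = spread([1, 0, 1])
chips[3] = -chips[3]          # simulate one chip damaged by noise
print(correlate(chips))       # the first peak drops but keeps its sign
print(despread(chips))
```

A large absolute correlation value indicates that the PN code is aligned with the data, which is why a damaged chip lowers the peak without necessarily changing the recovered bit.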

Preferably, the watermark detector determines from the correlation peak
value of the PN code whether the watermark information has been detected
correctly and, if it has not, performs a prescribed correction.

Preferably, the watermark detector calculates correlation values using
different PN codes, detects the correlation peak value of each PN code, and
estimates a row address and a column address according to the correlation
peak values.

Preferably, the watermark detector calculates the correlation of a
two-dimensional PN code, which includes different kinds of PN codes in the
row direction and the column direction respectively, so as to estimate the
area of the watermark information.

Preferably, the document is composed of multiple pages, and the watermark
detector calculates the correlation of a three-dimensional PN code, which
includes different kinds of PN codes for the row direction, the column
direction and the page direction, so as to estimate the area of the
watermark information.

Preferably, the multiple dot patterns are arranged on one surface, and
at least one of the dot patterns represents special watermark
information.

In accordance with a sixth aspect of the invention, a method of
embedding watermark information comprises the following steps:
representing the watermark information with dot patterns by means of a
watermark information embedding apparatus; generating a watermark image
in which dot patterns representing a special value denote the outline of
the watermark information area; and generating a containing watermark
document image by overlapping the watermark image and a prescribed
document image.

In accordance with a seventh aspect of the invention, a method of
embedding watermark information comprises the following steps: generating
a watermark image by using a watermark information embedding apparatus to
diffuse prescribed watermark information with a PN code; synthesizing the
watermark image and a prescribed document image so as to generate a
synthesized image; and outputting the synthesized image.

Preferably, the multiple dot patterns are arranged on one surface, and
at least one of the dot patterns represents special watermark
information.

In accordance with an eighth aspect of the invention, a method of
detecting watermark information uses a watermark information detecting
apparatus to extract, from a document, watermark information represented
as a watermark image having multiple dot patterns arranged on one surface
thereof. The method comprises the steps of: detecting an outline
representing a special value from the watermark image; and estimating the
area of the watermark information according to the outline.

In accordance with a ninth aspect of the invention, a method of detecting
watermark information uses a watermark information detecting apparatus to
extract, from a document, watermark information which is diffused by a PN
code and represented as a watermark image. The method comprises the steps
of: extracting the watermark image; calculating the correlation between
the watermark image and the PN code; and estimating the area of the
watermark information according to the preceding steps.

Preferably, the multiple dot patterns are arranged on one surface, and
at least one of the dot patterns represents special watermark
information.

In accordance with a tenth aspect of the invention, a method for
generating a containing watermark document comprises the following steps:
generating a watermark image by using a PN code to diffuse prescribed
watermark information; and synthesizing the watermark image and a
prescribed document.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is explained in further detail, and by way of example, with
reference to the accompanying drawings, wherein:

FIG. 1 is a block diagram of a watermark information embedding
apparatus and a watermark information detecting apparatus in a first
embodiment according to the present invention;

FIG. 2 is a flow chart showing the steps of the watermark image generator
appearing in FIG. 1;

FIG. 3 is an example of a watermark encoding in the first embodiment;

FIG. 4 is a sectional view showing the pixel value variation in FIG. 3
taken from the arctan (1/3) direction;

FIG. 5 is an explanatory view of other watermark information;

FIG. 6 is a schematic view showing the state of unit configuration;

FIG. 7 is a schematic view showing one character of the codes being
embedded in the watermark image;

FIG. 8 is a flow chart showing the watermark information embedded in
the watermark image;

FIG. 9 is a schematic view showing the embedding process of the watermark
information;

FIG. 10 shows unit images surrounding an outline of the watermark
information area;

FIG. 11 is an example of the containing watermark document image;

FIG. 12 is an enlarged view of FIG. 11;

FIG. 13 is a flow chart showing the steps of a watermark detector;

FIG. 14 is an example of an input image being divided into unit images;

FIG. 15 is an example showing Unit A of FIG. 3 (1) in the input image;



FIG. 16 is a sectional view of FIG. 15 taken from a direction parallel to
the DOA (direction of arrival) of the wave;

FIG. 17 is a schematic view showing the discriminating process of the
symbol unit;

FIG. 18 is a schematic view showing an example of the reconverting
process of the information;

FIG. 19 is a flow chart of the reconverting process of the data code;

FIG. 20 is a schematic view showing the reconverting process of the data
code;

FIG. 21 is a schematic view showing the bit confidence operation;

FIG. 22 is a block diagram of a watermark information embedding
apparatus and a watermark information detecting apparatus in a second
embodiment according to the present invention;

FIG. 23 is a schematic view showing the configuration of the shift
register code generator;

FIG. 24 is a schematic view showing configuration of 4 longest code
sequence generator;

FIG. 25 is a schematic view showing an autocorrelation function of the
longest code sequence;

FIG. 26 is a schematic view showing generation of the watermark image;

FIG. 27 is a schematic view showing the process of the watermark detector;

FIG. 28 is a schematic view showing processes of a third embodiment
(part one);

FIG. 29 is a schematic view showing processes of the third embodiment
(part two);

FIG. 30 is a schematic view showing a two-dimensional PN code sequence;

FIG. 31 is a schematic view showing the detecting process of the
two-dimensional PN code sequence in the fourth embodiment;

FIG. 32 is an example of a two-dimensional PN code sequence; and

FIG. 33 is an example of a three-dimensional PN code sequence.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 
[First Embodiment] 

A first embodiment of the present invention has been made to solve the
aforementioned problems, and arranges signal images of a special value
along the outline of the watermark information area. For example, in the
first embodiment, signal images denoting "1" are arranged around the
outline of the watermark information area.

FIG. 1 is a block diagram depicting a watermark information embedding
apparatus and a watermark information detecting apparatus in the first
embodiment according to the present invention. The system shown in FIG. 1
comprises a watermark information embedding apparatus 100 and a
watermark information detecting apparatus 300. A containing watermark
document 200 is output by the watermark information embedding apparatus
100 and is the object examined by the watermark information detecting
apparatus 300.

The watermark information embedding apparatus 100 is essentially a
computer that generates a document image from document data and from the
watermark information to be embedded in the document, and prints it on a
paper medium. The watermark information embedding apparatus 100
comprises a document image generator 101, a watermark image generator
102, a containing watermark document image synthesizer 103 and an output
device 104. Document data 105 is data generated by a document creation
tool. Watermark information 106 is information, such as symbol arrays,
video data or audio data, that is embedded in the paper medium in a form
other than text and is perceptually invisible. The document data 105 and
the watermark information 106 are stored in a storage device, such as a
hard disk or semiconductor memory, and can also be read in from outside
through a network port.

The document image generator 101 is a functional unit that transforms the
document data 105 into a document image, that is, an image of the
document as it is to be printed on a paper medium. Specifically, a white
pixel area in the document image is left blank during printing, and a
black pixel area in the document image is an area coated with black ink.
In the following description of the preferred embodiments of the present
invention, printing uses black ink (a single color). However, the present
invention is not limited to this, and multicolor printing can also be
applied.

The watermark image generator 102 digitizes the watermark information
106, encodes it as an N-ary code (N >= 2), and assigns each symbol of the
code to a signal prepared in advance. Each signal expresses a wave of
arbitrary direction and wavelength by an arrangement of dots in a
rectangular area of arbitrary size, and a symbol is assigned to each
combination of wave direction and wavelength. The watermark image is
formed by arranging such signals on the image according to a certain rule.
In other words, the watermark image generator 102 has the function of
generating the watermark information 106 as dot patterns. The watermark
image generating process is described in detail below.

The containing watermark document image synthesizer 103 overlaps the
document image and the watermark image, thereby generating the containing
watermark document image. The containing watermark document image
generated in this way includes dot patterns of a special value, which
represent the outline of the watermark information area in the watermark
image.

The output device 104 is essentially a printer for printing and
outputting the containing watermark document image. The document image
generator 101, the watermark image generator 102 and the containing
watermark document image synthesizer 103 can be realized as one function
of a printer driver or as separate software.

The containing watermark document 200 is printed matter in which the
watermark information 106 is embedded in the original document data 105.
The dot patterns of the watermark image appear as a shaded background of
the document image.

The watermark information detecting apparatus 300 recognizes the
watermark information embedded in the containing watermark document 200.
The watermark information detecting apparatus 300 includes an input
device 301 and a watermark detector 302.

The input device 301 is essentially a scanner, which reads the image on
the containing watermark document 200 as a multi-level gray image. The
watermark detector 302 filters the input image, detects the embedded
signals, restores the symbols from the detected signals, and extracts the
embedded watermark information. In addition, when the watermark detector
302 detects an area in which signals of the special value appear
continuously, it identifies that area as the outline of the watermark
information area and defines the area enclosed by the outline as the
recording area of the watermark information.

The processing flows of the watermark information embedding apparatus
100 and the watermark information detecting apparatus 300 are explained
in detail below, beginning with the watermark information embedding
apparatus 100.

First, regarding the document image generator 101: the document data 105
is generated by word processing software or the like and contains font
and layout information. The document image generator 101 generates, page
by page, an image of the document data 105 as printed on paper. The
document image is a binary black-and-white image in which white pixels
(value = 1) are the background and black pixels (value = 0) are the
character area (the area to which ink is applied).

Next, regarding the watermark image generator 102: the watermark
information 106 includes symbol data, video data, audio data and so on.
From this data, the watermark image generator 102 generates the watermark
image to be overlapped on the document image as its background.

FIG. 2 is a flow chart of the watermark image generator 102. Referring
to FIG. 2, the processing of the watermark image generator 102 includes
the following three steps.

First, in step S101, the watermark information 106 is transformed into an
N-ary code. N can be chosen arbitrarily; for convenience of explanation,
N = 2 is assumed in the first embodiment. Step S101 therefore generates a
binary code, represented as a bit array of 0s and 1s. In step S101, the
data may be encoded directly, or it may be encrypted before being
encoded.
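
As a minimal sketch of this first step with N = 2, the conversion of watermark information into a bit array could look as follows. The choice of text input, the UTF-8 encoding, the most-significant-bit-first order and the name `to_bit_array` are assumptions made for illustration; the embodiment does not specify them.

```python
def to_bit_array(watermark_info: str) -> list[int]:
    """Convert text into a list of bits, most significant bit first.

    Assumes the watermark information is text encoded as UTF-8
    (an illustrative choice, not part of the embodiment).
    """
    bits = []
    for byte in watermark_info.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits

print(to_bit_array("A"))  # 'A' is 0x41 -> [0, 1, 0, 0, 0, 0, 0, 1]
```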

Second, in step S102, a watermark signal is assigned to each symbol of
the code. A watermark signal expresses a wave of arbitrary direction and
wavelength by an arrangement of dots. The watermark signal is described
in detail below.



Then, in step S103, the signal units corresponding to the bit array of
the binary code are arranged on the watermark image.

The assignment of a watermark signal to each symbol of the code in step
S102 is now described in detail.

FIG. 3 is a schematic diagram showing an example of the watermark
signal.

Let the width and height of the watermark signal be Sw and Sh
respectively. Sw and Sh may differ, but for convenience of explanation
Sw = Sh is assumed in this embodiment. In FIG. 3, Sw = Sh = 12, where the
unit of length is the pixel. The size of the signal when printed on paper
is determined by the resolution of the watermark image. For example, if
the watermark image is 600 dpi (dots per inch, the unit of resolution,
that is, the number of dots within one inch), the width and height of the
watermark signal on the containing watermark document are each
12/600 = 0.02 inch.
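
The size calculation above can be checked with a short sketch; the function name `signal_size_inches` is a hypothetical helper introduced only for this illustration.

```python
def signal_size_inches(size_px: int, dpi: int) -> float:
    """Printed size of a signal unit side, given its size in pixels and the resolution."""
    return size_px / dpi

print(signal_size_inches(12, 600))  # 0.02
```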

In the following, a rectangle of width Sw and height Sh is taken as one
unit of signal and is called a signal unit. In FIG. 3 (1), the dots are
concentrated along the arctan (3) direction relative to the horizontal
axis (arctan being the inverse function of tan), and the direction of
arrival (DOA) of the wave is arctan (-1/3). This signal unit is called
"Unit A". In FIG. 3 (2), the dots are concentrated along the arctan (-3)
direction relative to the horizontal axis, the DOA of the wave is
arctan (1/3), and the signal unit is called "Unit B".

FIG. 4 is a sectional view showing the pixel value variation of FIG. 3 (1)
taken from the arctan (1/3) direction.

In FIG. 4, the areas where dots are arranged are wave troughs, and the
areas where no dots are arranged are wave crests. One unit contains two
areas in which dots are concentrated, so the frequency per unit in this
embodiment is 2. The direction of arrival (DOA) of the wave is
perpendicular to the direction along which the dots are concentrated. The
wave of Unit A is therefore in the arctan (-1/3) direction relative to the
horizontal axis, and the wave of Unit B is in the arctan (1/3) direction
relative to the horizontal axis. Note that the arctan (a) direction is
perpendicular to the arctan (b) direction when a × b = -1.
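
The perpendicularity condition can be verified numerically for Unit A and Unit B. The sketch below is illustrative only; the helper name `is_perpendicular` and the dictionary layout are assumptions, and only the slopes stated in the text are used.

```python
import math

def is_perpendicular(a: float, b: float, tol: float = 1e-9) -> bool:
    """True when the arctan(a) and arctan(b) directions are at right angles."""
    angle = abs(math.atan(a) - math.atan(b))
    return abs(angle - math.pi / 2) < tol

# (slope along which the dots are concentrated, slope of the wave direction)
units = {"A": (3, -1 / 3), "B": (-3, 1 / 3)}

for name, (dots, wave) in units.items():
    # Both checks agree: a right angle corresponds to a product of -1.
    print(name, is_perpendicular(dots, wave), math.isclose(dots * wave, -1))
```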

In this embodiment, the watermark signal of Unit A is assigned symbol 0
and the watermark signal of Unit B is assigned symbol 1; such units are
called symbol units.

Besides those shown in FIG. 3 (1) and (2), the following dot arrangements
can also be considered as watermark signals.

FIG. 5 is an explanatory view of other watermark signals.

As shown in FIG. 5 (3), the dots are concentrated along the arctan (1/3)
direction relative to the horizontal axis, and the DOA of the wave is
arctan (-3) relative to the horizontal axis. This signal unit is called
"Unit C".

As shown in FIG. 5 (4), the dots are concentrated along the arctan (-1/3)
direction and the DOA of the wave is the arctan (3) direction, both
relative to the horizontal axis. This signal unit is called "Unit D". In
FIG. 5 (5), the dots are concentrated along the arctan (1) direction and
the DOA of the wave is the arctan (-1) direction. Alternatively, the unit
of FIG. 5 (5) may be regarded as having its dots concentrated along the
arctan (-1) direction relative to the horizontal axis, with the DOA of the
wave at arctan (1) relative to the horizontal axis. The signal unit in
FIG. 5 (5) is called "Unit E".

In this way, many combinations of symbols and signal units are possible
in addition to those mentioned above. Which watermark signal is assigned
to which symbol can therefore be kept confidential, which makes it hard
for other (unauthorized) users to read the embedded signal.



Furthermore, if the watermark information is encoded as a 4-ary code in
step S102 shown in FIG. 2, symbol 0 is assigned to Unit A, symbol 1 to
Unit B, symbol 2 to Unit C, and symbol 3 to Unit D.
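
The 4-ary assignment can be sketched as a simple lookup. The digit-extraction helper `to_quaternary` and the example value 27 are assumptions introduced for illustration; only the symbol-to-unit table comes from the text.

```python
# Symbol-to-unit assignment for the 4-ary case described in the text.
SYMBOL_TO_UNIT = {0: "A", 1: "B", 2: "C", 3: "D"}

def to_quaternary(value: int, length: int) -> list[int]:
    """Express a non-negative integer as `length` base-4 digits, most significant first."""
    digits = []
    for _ in range(length):
        digits.append(value % 4)
        value //= 4
    return digits[::-1]

def assign_units(symbols: list[int]) -> list[str]:
    """Map each 4-ary symbol to the name of its signal unit."""
    return [SYMBOL_TO_UNIT[s] for s in symbols]

symbols = to_quaternary(27, 3)          # 27 = 1*16 + 2*4 + 3
print(symbols, assign_units(symbols))   # [1, 2, 3] ['B', 'C', 'D']
```

Because the table is the only link between symbols and units, swapping its entries changes the encoding without changing the printed appearance.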

In the examples shown in FIG. 3 and FIG. 5, the number of dots within one
unit is constant, so that when such units are arranged without gaps the
shade of the watermark image appears uniform. On the printed paper, it
therefore looks as if a gray image of a single density were embedded as a
backdrop.

To obtain this effect, for example, Unit E is defined as a backdrop unit
(a signal unit to which no symbol is assigned) and is arranged without
gaps to form the backdrop of the watermark image. When a symbol unit
(Unit A or Unit B) is embedded in the watermark image, the backdrop unit
(Unit E) at the embedding position is replaced by the symbol unit.

FIG. 6 is a schematic view showing the arrangement of units. In
FIG. 6 (1), Unit E is defined as the backdrop unit and is arranged without
gaps to form the backdrop of the watermark image. FIG. 6 (2) is an
example in which Unit A is embedded in the watermark image of FIG. 6 (1).
FIG. 6 (3) is an example in which Unit B is embedded in the watermark
image of FIG. 6 (1). This embodiment describes the use of the backdrop
unit as the backdrop of the watermark image, but the watermark image may
also be generated by arranging symbol units only.

FIG. 7 is a schematic view showing how one symbol of the code is
embedded in the watermark image. FIG. 7 shows an example in which the bit
array "0101" is embedded. As shown in FIG. 7 (1) and FIG. 7 (2), the same
symbol unit is embedded repeatedly so that it can still be detected
correctly when characters of the document overlap the symbol units. The
number of repetitions of the symbol unit and the arrangement pattern
(called a "unit image") are arbitrary.

FIG. 7 (1) is an example of a unit image in which the number of
repetitions is four (there are four symbol units in one unit image). In
FIG. 7 (2), the number of repetitions is two (there are two symbol units
in one unit image). The number of repetitions may also be one (there is
one symbol unit in one unit image).
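
The repetition of a symbol unit inside a unit image can be sketched as follows for the bit array "0101". Representing a unit image as a flat list of unit names is an assumption made for illustration; the spatial arrangement of the repeated units within a unit image is not modeled here.

```python
# Symbol units of the first embodiment: symbol 0 -> Unit A, symbol 1 -> Unit B.
UNIT_FOR_BIT = {0: "A", 1: "B"}

def unit_images(bits: list[int], repeats: int) -> list[list[str]]:
    """Build one unit image per bit, each holding `repeats` copies of its symbol unit."""
    return [[UNIT_FOR_BIT[bit]] * repeats for bit in bits]

print(unit_images([0, 1, 0, 1], 2))
# [['A', 'A'], ['B', 'B'], ['A', 'A'], ['B', 'B']]
```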

In FIG. 7 (1) and FIG. 7 (2), one symbol is assigned to each symbol unit.
In FIG. 7 (3), a symbol is assigned to the arrangement pattern of the
symbol units.

The number of bits of information that can be embedded in the watermark
image of one page is determined by the size of the signal unit, the size
of the unit image and the size of the document image. The number of
signals embedded in the horizontal and vertical directions of the
document image may be treated as known at detection, or may be calculated
back from the size of the input image and the size of the signal unit.

Suppose that Pw unit images are embedded in the horizontal direction of
the watermark image of one page and Ph unit images in the vertical
direction. A unit image at an arbitrary position can then be written
U(x, y), where x = 1 to Pw and y = 1 to Ph; the set of these is called the
"unit image array". The number of bits that can be embedded in one page
is Pw × Ph, which is called the embedded bit amount.
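
As a small illustration of the unit image array, the positions U(x, y) can be enumerated and counted. The function name `unit_image_positions` is hypothetical, and the 9 × 11 size is borrowed from the FIG. 9 example.

```python
def unit_image_positions(pw: int, ph: int) -> list[tuple[int, int]]:
    """List the (x, y) indices of all unit images, row by row, with 1-based indices."""
    return [(x, y) for y in range(1, ph + 1) for x in range(1, pw + 1)]

positions = unit_image_positions(9, 11)
print(len(positions))  # embedded bit amount Pw x Ph = 99
```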

FIG. 8 is a flow chart showing how the watermark information is embedded
in the watermark image. The case in which the same information is
embedded repeatedly in the watermark image of one sheet (one page) is
described. Because the same information is embedded repeatedly, the
embedded watermark information can still be read out even when an entire
unit image is obliterated by overlapping the document image and the
information embedded in it is lost.

Step S201: transforming the watermark information 106 into an N-ary
code. This step is similar to step S101 in FIG. 2. Here, the encoded data
is called the data code, and the data code represented as a combination
of unit images is called the "data code unit" Du.

Step S202: calculating, from the code length (number of bits) of the data
code and the embedded bit amount, how many times the data code unit can
be embedded repeatedly in the image of one page. The code-length data of
the data code is embedded in the first row of the unit image array,
excluding the outline of the watermark image area. Alternatively, the
code length of the data code may be fixed, in which case the code-length
data need not be embedded in the watermark image.
The data code length is denoted by Cn, and the number of times Dn that
the data code unit is embedded is given by the following formula:

[Formula 1]: $D_n = \left\lfloor \dfrac{P_w \times (P_h - 1)}{C_n} \right\rfloor$

Here $\lfloor A \rfloor$ denotes the largest integer not exceeding A.

If the remainder is denoted by Rn ($R_n = P_w \times (P_h - 1) - C_n \times D_n$),
the unit image array holds the data code unit embedded Dn times together
with unit images corresponding to the first Rn bits of the data code.
However, the remaining Rn bits need not always be embedded.
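
Reading Formula 1 as the integer quotient of Pw × (Ph - 1) by Cn, with Rn as the corresponding remainder, the calculation can be sketched as below. This reading is an assumption, chosen because it agrees with the FIG. 9 example (a 9 × 11 array and a 12-bit data code give seven repetitions and six leftover bits); the function name `repetitions` is hypothetical.

```python
def repetitions(pw: int, ph: int, cn: int) -> tuple[int, int]:
    """Return (Dn, Rn): full repetitions of the data code and leftover unit images.

    The first row of the unit image array is reserved for the code-length data,
    so only Pw * (Ph - 1) unit images are available for the data code.
    """
    capacity = pw * (ph - 1)
    return capacity // cn, capacity % cn

print(repetitions(9, 11, 12))  # (7, 6)
```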

FIG. 9 is an example showing the process of embedding the watermark
information. As shown in FIG. 9, the size of the unit image array is
9 × 11 (11 rows, 9 columns) and the data code length is 12 (the code
symbols of the data code are numbered 0 to 11 in FIG. 9).

Step S203: embedding the code-length data in the first row of the unit
image array. FIG. 9 shows an example in which the code-length data is
expressed in 9 bits and is embedded only once. However, like the data
code, the code-length data can be embedded repeatedly as long as the
width Pw of the unit image array is large enough.

Step S204: embedding the data code unit repeatedly from the second row of
the unit image array onward. As shown in FIG. 9, the data code unit is
embedded in order along the row direction, starting from the MSB (most
significant bit) or the LSB (least significant bit) of the data code.
FIG. 9 shows an example in which the data code unit is embedded seven
times, followed by the first six bits of the data code.

As shown in FIG. 9, the data can be embedded continuously along the
rows; alternatively, it can be embedded continuously along the columns.
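
The row-wise layout of steps S203 and S204 can be sketched as follows. The sketch assumes that the code length is written as an unsigned binary number filling the first row (most significant bit first), that the data code is written from its first bit, and that the outline is handled separately; these choices and the name `layout_unit_array` are illustrative assumptions.

```python
def layout_unit_array(pw: int, ph: int, data_code: list[int]) -> list[list[int]]:
    """Fill a Ph x Pw array: row 1 holds the code length, later rows repeat the data code."""
    cn = len(data_code)
    if cn >= 2 ** pw:
        raise ValueError("code length does not fit in one row")
    # Step S203: code length as a Pw-bit binary number, most significant bit first.
    rows = [[(cn >> shift) & 1 for shift in range(pw - 1, -1, -1)]]
    # Step S204: repeat the data code along the rows from the second row onward.
    index = 0
    for _ in range(ph - 1):
        rows.append([data_code[(index + i) % cn] for i in range(pw)])
        index += pw
    return rows

data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]   # an arbitrary 12-bit data code
array = layout_unit_array(9, 11, data)
print(array[0])   # [0, 0, 0, 0, 0, 1, 1, 0, 0] -> code length 12
```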

The embedding of the outline in this embodiment is now described.

The watermark image generator 102 generates a watermark image like that
of FIG. 9, and unit images representing 1 are arranged contiguously
around the watermark information area.

FIG. 10 shows the unit images surrounding the outline of the watermark
information area. As shown in FIG. 10, unit images representing 1 are
arranged contiguously around the watermark information area; the image
inside the watermark information area is omitted from the figure.
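
Surrounding the watermark information area with unit images representing 1 can be sketched as adding a one-unit-wide frame around a two-dimensional array of symbols. Whether the frame lies outside the unit image array or occupies its outermost positions is not modeled here; placing it outside, and the name `add_outline`, are assumptions made for illustration.

```python
def add_outline(area: list[list[int]], value: int = 1) -> list[list[int]]:
    """Return a copy of `area` surrounded by a one-unit-wide frame of `value`."""
    width = len(area[0]) + 2
    framed = [[value] * width]
    for row in area:
        framed.append([value] + list(row) + [value])
    framed.append([value] * width)
    return framed

for row in add_outline([[0, 1], [1, 0]]):
    print(row)
```

A detector that finds such a continuous run of the special value can take the enclosed rectangle as the recording area.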

The above describes how the watermark image generator 102 generates the
watermark image. The containing watermark document image synthesizer 103
is explained next.

The containing watermark document image synthesizer 103 overlaps the
document image generated by the document image generator 101 with the
watermark image generated by the watermark image generator 102. Each pixel
value of the containing watermark document image is obtained by ANDing the
corresponding pixels of the document image and the watermark image. That is
to say, if either the pixel of the document image or the pixel of the
watermark image is 0 (black), the pixel of the containing watermark document
image is 0 (black); otherwise it is 1 (white).
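The AND operation described above can be sketched as follows. The sketch assumes that both images are binary, have the same size, and are stored as lists of rows with 0 for black and 1 for white; the function name synthesize and the sample pixel values are hypothetical.

```python
def synthesize(document_image, watermark_image):
    """Pixel-wise AND of two binary images given as lists of rows (0 = black, 1 = white)."""
    if len(document_image) != len(watermark_image):
        raise ValueError("images must have the same height")
    result = []
    for doc_row, wm_row in zip(document_image, watermark_image):
        if len(doc_row) != len(wm_row):
            raise ValueError("images must have the same width")
        # A pixel is white (1) only if it is white in both images.
        result.append([d & w for d, w in zip(doc_row, wm_row)])
    return result

document = [[1, 1, 0],
            [1, 0, 0]]
watermark = [[1, 0, 1],
             [1, 1, 0]]
combined = synthesize(document, watermark)
print(combined)   # [[1, 0, 0], [1, 0, 0]]
```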

FIG. 11 is an example of the containing watermark document image. 
FIG. 12 is an enlarged view of FIG. 11.

In FIG. 12, the unit image shown in FIG. 7 (1) is used. The containing
watermark document image is output by the output device 104.

The above describes the watermark information embedding apparatus 100.
The watermark information detecting apparatus 300 is described next with
reference to FIG. 1 and FIGS. 13 to 21.

FIG. 13 is a flow chart showing the processing of the watermark detector 302.

Step S301: inputting the containing watermark document image into the
storage of a computer through the input device 301, such as a scanner. This
image is called the "input image". The input image can be a multi-valued
image; the following explanation assumes a grayscale image with 256 gray
levels. The resolution of the input image (the reading resolution of the
input device 301) may differ from that of the containing watermark document
image generated by the watermark information embedding apparatus 100, but in
this embodiment the two resolutions are assumed to be equal. It is also
assumed that the input image has already been corrected for rotation,
expansion or contraction, and the like.

Step S302: calculating the number of embedded unit images from the size of
the signal unit and the size of the input image. For example, suppose that
the size of the input image is W (width) × H (height), the size of the signal
unit is Sw × Sh, and one unit image consists of Uw × Uh signal units. The
number of unit images embedded in the input image is then N = Pw × Ph,
where Pw and Ph are given by the following formula:

[Formula 2]: $P_w = \dfrac{W}{S_w \times U_w}, \quad P_h = \dfrac{H}{S_h \times U_h}$

If the resolution of the watermark information embedding apparatus 100
differs from that of the watermark information detecting apparatus 300, the
number N of embedded unit images can still be calculated with Formula 2
after the size of the signal unit in the input image has been normalized by
the ratio of the two resolutions.
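As a rough sketch, the calculation of step S302 might look as follows. The function name count_unit_images, the parameters embed_dpi and scan_dpi, the use of integer division, the simple rescaling used for the normalization and the numeric values are all assumptions made for illustration.

```python
def count_unit_images(W, H, Sw, Sh, Uw, Uh, embed_dpi=None, scan_dpi=None):
    """Return (Pw, Ph, N) following Formula 2."""
    if embed_dpi and scan_dpi:
        # Assumption: normalization rescales the signal unit size by the
        # ratio of the scanning resolution to the embedding resolution.
        ratio = scan_dpi / embed_dpi
        Sw, Sh = Sw * ratio, Sh * ratio
    # Assumption: fractional unit images at the image border are discarded.
    Pw = int(W // (Sw * Uw))
    Ph = int(H // (Sh * Uh))
    return Pw, Ph, Pw * Ph

# Made-up sizes: 12 x 12 pixel signal units, 4 x 4 signal units per unit image.
print(count_unit_images(432, 528, 12, 12, 4, 4))               # (9, 11, 99)
# The same page scanned at twice the embedding resolution.
print(count_unit_images(864, 1056, 12, 12, 4, 4, 600, 1200))   # (9, 11, 99)
```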

Step S303: dividing the input image into blocks according to the number of
unit images obtained in step S302.

FIG. 14 shows an example of an input image (FIG. 14 (1)) and an example of
the input image after it has been divided into unit images (FIG. 14 (2)).

Step S304: detecting the signal unit in each divided unit image and
restoring the unit image array. The signal detection is described in detail
below.

FIG. 15 is an example showing how Unit A of FIG. 3 (1) appears in the input
image. The signal unit in FIG. 3 is a binary image, whereas in FIG. 15 it is
a multi-valued image. As shown in FIG. 15, when a binary image is printed,
the shade of the image varies continuously because of ink bleeding and other
factors, so that halftones between white and black appear around the dots.

FIG. 16 is a sectional view of FIG. 15 taken along a direction parallel to
the DOA (direction of arrival) of the wave.

As the drawings show, the wave in FIG. 4 is a rectangular wave, whereas the
wave in FIG. 16 is a smooth, rounded wave.

The input image is affected by various kinds of noise caused by local
variations in the thickness of the paper, stains on the printed document,
instability of the output device or the image input device, and so on. The
description here assumes that the input image contains no noise. However,
the methods of this invention can also detect the signal stably from an
input image that contains noise.

In order to detect the signal unit from the input image, a
two-dimensional wavelet filter that can define the frequency, direction and
amplitude of the wave simultaneously is used. The Gabor filter is an example
of such a two-dimensional wavelet filter. Other filters having the same
function as the Gabor filter can also be used. Alternatively, a template
having the same dot pattern as the signal unit can be defined and pattern
matching can be performed.

The Gabor filter is denoted G(x, y), where x = 0 to gw - 1 and y = 0 to
gh - 1. The size of the filter is gw × gh, which is equal to the size of the
signal unit embedded by the watermark information embedding apparatus 100.

[Formula 3]: $G(x, y) = \exp\left[-\pi\left\{\dfrac{(x - x_0)^2}{A^2} + \dfrac{(y - y_0)^2}{B^2}\right\}\right] \times \exp\left[-2\pi i\left\{u(x - x_0) + v(y - y_0)\right\}\right]$

wherein:
i is the imaginary unit;
x = 0 to gw - 1, y = 0 to gh - 1, x0 = gw/2, y0 = gh/2;
A: extent of influence in the horizontal direction;
B: extent of influence in the vertical direction;
$\tan^{-1}(u/v)$: direction of arrival (DOA) of the wave;
$\sqrt{u^2 + v^2}$: frequency.

In the signal detecting process, as many Gabor filters are prepared as
there are kinds of signal units embedded in the watermark image. Each Gabor
filter has the same wave frequency, DOA and size as the corresponding
signal unit. The Gabor filters corresponding to Unit A and Unit B in FIG. 3
are called Filter A and Filter B, respectively.

The output value of a filter at an arbitrary position in the input image is
calculated by convolving the filter with the image. The Gabor filter consists
of a real-part filter and an imaginary-part filter, which are shifted in
phase by half a wavelength relative to each other. The square root of the
sum of the squares of the two filter outputs is taken as the output value of
the Gabor filter.

For example, if the convolution of the real-part filter of Filter A with the
input image is denoted Rc and the convolution of the imaginary-part filter of
Filter A with the input image is denoted Ic, the output value F(A) is given
by Formula 4:

[Formula 4]: $F(A) = \sqrt{R_c^2 + I_c^2}$
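The filter output can be illustrated with a small Python sketch. It assumes the usual Gabor form, that is, a Gaussian envelope with extents A and B multiplied by a complex exponential carrier with frequency components u and v, and it evaluates the response at a single filter position as a sum of products. The function names, the 12 × 12 filter size, the parameter values and the synthetic test patch are hypothetical.

```python
import cmath
import math

def gabor_kernel(gw, gh, A, B, u, v):
    """Complex Gabor kernel G(x, y) as a list of rows (assumed standard form)."""
    x0, y0 = gw / 2, gh / 2
    kernel = []
    for y in range(gh):
        row = []
        for x in range(gw):
            envelope = math.exp(-math.pi * ((x - x0) ** 2 / A ** 2 + (y - y0) ** 2 / B ** 2))
            carrier = cmath.exp(-2j * math.pi * (u * (x - x0) + v * (y - y0)))
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def filter_output(kernel, patch):
    """F = sqrt(Rc^2 + Ic^2) for the filter placed at one position on the patch."""
    rc = ic = 0.0
    for k_row, p_row in zip(kernel, patch):
        for k, p in zip(k_row, p_row):
            rc += k.real * p    # response of the real-part filter
            ic += k.imag * p    # response of the imaginary-part filter
    return math.sqrt(rc * rc + ic * ic)

def wave_patch(gw, gh, u, v):
    """Synthetic gray patch containing a sinusoidal wave with frequency (u, v)."""
    return [[0.5 + 0.5 * math.cos(2 * math.pi * (u * x + v * y)) for x in range(gw)]
            for y in range(gh)]

filter_a = gabor_kernel(12, 12, 6.0, 6.0, 0.125, 0.125)    # stands in for Filter A
filter_b = gabor_kernel(12, 12, 6.0, 6.0, 0.125, -0.125)   # stands in for Filter B
patch = wave_patch(12, 12, 0.125, 0.125)                   # wave matching Filter A
print(filter_output(filter_a, patch) > filter_output(filter_b, patch))   # True
```

Because the output is the magnitude of the complex response, it does not depend on the phase of the wave inside the patch, which is why the detector can tolerate small positional offsets.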

FIG. 17 is a schematic view showing the process of discriminating whether
the signal unit embedded in the unit image U(x, y) divided in step S303 is
Unit A or Unit B.

The discrimination for the unit image U(x, y) is carried out in the
following steps:

(1) moving Filter A over the unit image U(x, y) while calculating F(A) at
each position, so as to obtain the maximum value. The maximum value of F(A)
is taken as the output value of Filter A for the unit image U(x, y) and is
denoted Fu(A, x, y).

(2) similarly to step (1), obtaining the output value of Filter B for the
unit image U(x, y) and recording it as Fu(B, x, y).

(3) comparing Fu(A, x, y) with Fu(B, x, y). If Fu(A, x, y) ≥ Fu(B, x, y),
the signal unit embedded in the unit image U(x, y) is discriminated as
Unit A; if Fu(A, x, y) < Fu(B, x, y), it is discriminated as Unit B.

In steps (1) and (2), the range over which the filter is moved can be chosen
freely, and it is sufficient to calculate the output values only at
representative positions of the unit image. In step (3), if the absolute
value of the difference between Fu(A, x, y) and Fu(B, x, y) is less than a
predetermined threshold, the unit image can be treated as undeterminable.

In step (1), if the maximum value of F(A) exceeds a predetermined threshold
while the filter is being moved and the convolution calculated, the signal
unit embedded in the unit image U(x, y) can be regarded as Unit A and the
discrimination process can be ended at that point. Similarly, in step (2),
if the maximum value of F(B) exceeds the predetermined threshold, the signal
unit embedded in the unit image U(x, y) can be regarded as Unit B.
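The comparison of steps (1) to (3) can be sketched as below. The sketch assumes that the filter outputs F(A) and F(B) at the scanned positions of one unit image have already been computed and are passed in as lists; the function name discriminate, the parameter min_gap and the sample values are hypothetical.

```python
def discriminate(fa_values, fb_values, min_gap=0.0):
    """Return 'A', 'B' or None (undeterminable) for one unit image."""
    fu_a = max(fa_values)   # Fu(A, x, y): maximum of F(A) over the scanned positions
    fu_b = max(fb_values)   # Fu(B, x, y): maximum of F(B) over the scanned positions
    if abs(fu_a - fu_b) < min_gap:
        return None         # difference below the threshold: no decision is made
    return "A" if fu_a >= fu_b else "B"

print(discriminate([0.2, 0.9, 0.4], [0.1, 0.3, 0.2]))                 # A
print(discriminate([0.2, 0.3, 0.1], [0.5, 0.8, 0.6]))                 # B
print(discriminate([0.50, 0.52], [0.51, 0.50], min_gap=0.05))         # None
```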

The above describes the signal detection (step S304). Step S305 in FIG. 13
is described next. In step S305, the symbols of the unit image array are
collected, the data code is reconstructed, and the original information is
restored.

FIG. 18 is a schematic view showing an example of the information
restoring process.

The steps of the information restoring process are as follows:

(1) detecting the symbol embedded in each unit image (FIG. 18 (1));

(2) collecting the symbols and restoring the data code (FIG. 18 (2));

(3) reading out the embedded information from the restored data code
(FIG. 18 (3)).

FIGS. 19 to 21 show an example of a method for restoring the data code. The
restoring method is substantially the reverse of the operation of FIG. 8.

FIG. 19 is a flow chart of the data code restoring process.

FIG. 20 is a schematic view showing the data code restoring process.

Step S401: detecting the outline of the unit image array. In this step, the
unit images representing "1" that are arranged around the watermark
information area in FIG. 10 are detected. That is to say, runs of consecutive
unit images representing "1" are searched for in the row direction and the
column direction of the unit image array. If such runs are detected, the
region enclosed by them is regarded as the watermark information area. In
the following steps, the unit images forming the outline are excluded from
processing.

Step S402: reading out the code length data from the first row of the unit
image array to obtain the code length of the embedded data code.

Step S403: calculating the number of times Dn the data code unit is embedded
and the remainder Rn from the size of the unit image array and the code
length of the data code obtained in step S402.

Step S404: extracting the data code units from the second row of the unit
image array onward by reversing the procedure of step S204. In the example
of FIG. 20, the unit images are taken out twelve at a time, starting from
U(1, 2) (second row, first column): U(1, 2) to U(3, 3), U(4, 3) to U(6, 4),
and so on. Since Dn = 7 and Rn = 6, the twelve unit images (one data code
unit) are extracted seven times, and six unit images, U(4, 11) to U(9, 11)
(corresponding to the first six unit images of a data code unit), are
extracted as the remainder.
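Steps S403 and S404 amount to an integer division followed by slicing. A minimal sketch, assuming the detected symbols from the second row onward have already been collected row by row into one list (the function name and the placeholder symbol values are hypothetical):

```python
def split_data_code_units(symbols, code_len):
    """Split the symbols read from the second row onward into data code units."""
    dn, rn = divmod(len(symbols), code_len)          # Dn full units, Rn leftover symbols
    units = [symbols[i * code_len:(i + 1) * code_len] for i in range(dn)]
    remainder = symbols[dn * code_len:]              # first Rn symbols of one more unit
    return dn, rn, units, remainder

# 9 columns x 10 rows of symbols below the code length row, 12-bit data code.
symbols = [i % 12 for i in range(9 * 10)]            # placeholder values 0..11
dn, rn, units, remainder = split_data_code_units(symbols, 12)
print(dn, rn)   # 7 6
```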

Step S405: performing a bit confidence operation on the data code units
extracted in step S404 to reconstruct the embedded data code. The bit
confidence operation is described in detail below.

FIG. 21 is a schematic view showing the bit confidence operation.

As shown in FIG. 21, the first data code unit, extracted starting from the
second row and first column of the unit image array, is denoted Du(1, 1) to
Du(12, 1), the next is denoted Du(1, 2) to Du(12, 2), and so on. The
remainder is denoted Du(1, 8) to Du(6, 8). The bit confidence operation
determines the symbol value of each bit of the data code by taking a majority
decision over the corresponding elements of the data code units. Even if
the signal of some unit in some data code unit is not detected correctly
because of overlap with text, stains on the paper and the like (bit inversion
errors, etc.), the data code can still be restored correctly in the end.

For example, the first bit of the data code is determined to be "1" if the
majority of the detection results for Du(1, 1), Du(1, 2), ..., Du(1, 8) are
"1", and to be "0" if the majority are "0". Similarly, the second bit of the
data code is determined from the majority of the detection results for
Du(2, 1), Du(2, 2), ..., Du(2, 8), and the twelfth bit from the majority of
the detection results for Du(12, 1), Du(12, 2), ..., Du(12, 7) (only up to
Du(12, 7), because Du(12, 8) does not exist).
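The majority decision can be sketched as follows. The function name majority_decode, the tie-breaking rule and the sample data code are assumptions for illustration; the text does not state how a tie is resolved.

```python
def majority_decode(units, remainder, code_len):
    """Recover each bit of the data code by majority vote over its copies."""
    decoded = []
    for m in range(code_len):
        votes = [unit[m] for unit in units]
        if m < len(remainder):
            votes.append(remainder[m])      # the partial copy also votes
        ones = sum(votes)
        # Assumption: a tie is resolved as 0.
        decoded.append(1 if ones * 2 > len(votes) else 0)
    return decoded

code = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]     # made-up 12-bit data code
units = [list(code) for _ in range(7)]           # seven detected copies
remainder = code[:6]                             # plus the first six bits
units[0][0] ^= 1                                 # simulate three detection errors
units[3][11] ^= 1
units[5][11] ^= 1
print(majority_decode(units, remainder, 12) == code)   # True
```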

The bit confidence operation can also be carried out by summing the output
values of the signal detecting filters of FIG. 17. For example, suppose that
symbol "0" is assigned to Unit A in FIG. 3 (1) and symbol "1" is assigned to
Unit B in FIG. 3 (2), and that the maximum output values of Filter A and
Filter B for Du(m, n) are denoted Df(A, m, n) and Df(B, m, n), respectively.
If Formula 5 holds for the M-th bit of the data code, its value is
determined to be "1"; otherwise, its value is "0". Formula 5 is as follows:

[Formula 5]: $\sum_{n=1}^{D_n} Df(A, M, n) \geq \sum_{n=1}^{D_n} Df(B, M, n)$

However, if M ≤ Rn, the summation of Df is taken over n = 1 to Dn + 1.
The above describes the case in which the data code is embedded repeatedly.
If an error-correcting code is used when the data is encoded, however, the
data code unit need not be embedded repeatedly.

The first embodiment of the invention has been described with reference
to the drawings. The following points about the present invention should be
noted.

(1) Different arrangements of the dots are used to represent the embedded
information, so the embedding does not depend on the font, character
spacing or line spacing of the original document.

(2) The density (number of dots per unit area) of a dot pattern to which a
symbol is assigned is equal to that of a dot pattern to which no symbol is
assigned. To the naked eye, therefore, the document appears to be covered
with a halftone texture of uniform density, and the watermark information is
inconspicuous.

(3) If it is kept secret which dot patterns have a symbol assigned and
which do not, it is difficult to decipher the embedded information.

(4) The image representing the information is a combination of unit images
and is embedded over the whole document as a background. Thus, even if the
embedding algorithm is made public, it is hard to tamper with the
information embedded in the printed document.

(5) The embedded signal is detected from the variation in shade along the
wave's DOA (direction of arrival). Detection therefore does not require
pixel-level precision, and the embedded information can be detected reliably
even if there are some stains on the printed document.

(6) The same information is embedded repeatedly, and the embedded
information is restored from these repetitions during the detecting process.
Even if the characters are relatively large and overlap part of the signal,
or some information is lost because of stains on the printed document, etc.,
the embedded information can still be detected reliably.

(7) Because unit images with a specific value are arranged contiguously
around the watermark information area, the watermark information area can
be detected correctly even if the containing watermark document 200 is
folded, stretched, etc. Thus, a reliable method of detecting the watermark
information is obtained.

[Second Embodiment]

A second embodiment of the present invention provides a method for using a
PN (Pseudo Noise) code to diffuse the watermark information so as to
generate a watermark image.

FIG. 22 shows the configuration of the second embodiment of the present
invention.

The system shown in FIG. 22 is composed of a watermark information
embedding apparatus 100a and a watermark information detecting apparatus
300a. A containing watermark document 200 is the containing watermark
document generated by the watermark information embedding apparatus 100a.

As shown in FIG. 22, the watermark information embedding apparatus 100a
comprises a document image generator 101, a watermark image generator
102a, a containing watermark document image synthesizer 103, an output
device 104, document data 105, watermark information 106 and a PN (Pseudo
Noise) code generator 107. The document image generator 101, the containing
watermark document image synthesizer 103, the output device 104, the
document data 105 and the watermark information 106 are the same as those in
the first embodiment and are denoted with the same reference numerals, so
their description is omitted here. The PN code generator 107 generates PN
codes according to the watermark information. The watermark image generator
102a generates a watermark image by diffusing the watermark information with
the PN codes. The PN code generator 107 uses a known method, described
later, to generate the PN codes (pseudo-random sequences).

The watermark information detecting apparatus 300a comprises an input
device 301 and a watermark detector 302a. The watermark detector 302a
calculates the correlation with the PN code. It detects the watermark
information area, reads out the watermark information from that area, and
judges from the peak of the correlation value with the PN code whether the
watermark information has been detected correctly. If the watermark
information has not been detected correctly from the watermark information
area, the watermark detector 302a performs a prescribed correction process.

Before the operation of the watermark information embedding apparatus 100a
is described, the PN code is explained.

First, the longest code sequence (commonly called a maximal-length sequence
or M-sequence) is described. The longest code sequence is a code sequence
with a period of n bits in which the numbers of "0"s and "1"s are equal or
differ by only one. When the phases coincide, the autocorrelation of the
longest code sequence is 1; otherwise it is 0 or -1/n. The longest code
sequence is used for code diffusion and for acquiring synchronization in
synchronous, single-user CDMA (Code Division Multiple Access). It is
generated by a shift register of a given length with feedback of a modulo-2
sum. If n now denotes the number of stages of the shift register, the length
of the longest code sequence is L = 2^n - 1. It is generated by a shift
register code generator, as described below.

FIG. 23 is a schematic view showing the configuration of the shift register
code generator.

As seen in FIG. 23, rj represents the state of each stage of the shift
register and si represents the multiplier coefficient (0 or 1) of each stage.
The shift register code generator is composed of a multistage feedback logic
loop. The longest code sequence generated by the shift register code
generator is denoted oj and is calculated as follows:

[Formula 6] 

FIG. 24 is a schematic view showing the configuration of a 4-stage longest
code sequence generator.

If, in FIG. 23, s0 = 1, s1 = 1, s2 = s3 = 0 and the initial value of the
shift register is 0001, the code generator shown in FIG. 24 generates a
longest code sequence with a period of 15 (sequence length 15), shown below.
Each group of 15 bits is one period.

000100110101111 000100110101111 0001001...
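A shift register of this kind can be sketched in a few lines of Python. The mapping of the coefficients s0 to s3 onto register positions and the order in which the initial value 0001 is loaded are assumptions (FIG. 23 is not reproduced here); they were chosen so that the sketch reproduces the 15-bit period quoted above. The function name m_sequence is hypothetical.

```python
def m_sequence(taps, state, length):
    """Generate bits from a linear feedback shift register (illustrative wiring).

    taps  : coefficients s0..s(n-1), each 0 or 1
    state : initial register contents, oldest bit first
    """
    state = list(state)
    out = []
    for _ in range(length):
        out.append(state[0])                 # the oldest bit is output
        feedback = 0
        for s, r in zip(taps, state):
            feedback ^= s & r                # modulo-2 sum of the tapped stages
        state = state[1:] + [feedback]       # shift and feed the sum back in
    return out

bits = m_sequence(taps=[1, 1, 0, 0], state=[0, 0, 0, 1], length=30)
print("".join(str(b) for b in bits))   # 000100110101111000100110101111
```

The output repeats after 2^4 - 1 = 15 bits, and each period contains eight "1"s and seven "0"s.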

The autocorrelation function of the longest code sequence is the average,
over one period, of the product of the sequence and a copy of itself shifted
by t bits. If a bit value of "1" in the code sequence is denoted m_i = 1 and
a bit value of "0" is denoted m_i = -1, the autocorrelation function can be
expressed as follows:

[Formula 7]: $R(t) = \dfrac{1}{L} \sum_{i=0}^{L-1} m_i \, m_{i+t}$

where the index i + t is taken modulo L. If t mod L = 0, the autocorrelation
value is 1; otherwise, the autocorrelation value is -1/L.
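The two autocorrelation values can be checked numerically with a short sketch that applies Formula 7 to the period-15 sequence quoted in the text. The function name autocorrelation is hypothetical.

```python
def autocorrelation(bits, t):
    """R(t) of Formula 7 for one period of a binary sequence."""
    L = len(bits)
    m = [1 if b else -1 for b in bits]            # map 1 -> +1 and 0 -> -1
    return sum(m[i] * m[(i + t) % L] for i in range(L)) / L

sequence = [int(c) for c in "000100110101111"]     # one period, L = 15
print(autocorrelation(sequence, 0))                # 1.0
print(autocorrelation(sequence, 1))                # -1/15, about -0.0667
```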

FIG. 25 is a schematic view showing the autocorrelation function of the
longest code sequence.

As shown in FIG. 25, when the correlation of a longest code sequence with
itself is calculated, the result is "1" when the phases coincide and
"-1/L" when the phases are shifted. Synchronization can therefore be
achieved easily by calculating the correlation between a signal diffused with
the longest code sequence and the longest code sequence used for the
diffusion.

Because there are only a few longest code sequences with the same period,
asynchronous CDMA uses composite sequences, such as Gold sequences,
generated by combining multiple shift register code generators.

The PN code generator 107 and the watermark image generator 102a of
FIG. 22, which differ from those in the first embodiment, are described
below.

The PN code generator 107 generates the PN code. The PN code may be, for
example, a Gold sequence generated with shift register code generators as
in FIG. 23, or a pseudo-random sequence generated by another method. The
sequence can be generated dynamically by a code generator or statically
from a table.

The watermark image generator 102a uses the PN code generated by the PN
code generator 107 to diffuse the information bit sequence generated from
the watermark information 106, thereby generating the watermark image.

FIG. 26 is a schematic view showing the generation of the watermark image.
Diffusion with the PN code follows these rules:

(1) If the information bit to be embedded is "0", the PN code is used as it
is.

(2) If the information bit to be embedded is "1", the PN code is used with
its bits inverted.

(3) The PN codes obtained for the successive information bits are
concatenated and embedded contiguously as unit images.

When the PN code is 15 bits long, N information bits are diffused into
(N × 15) bits by the PN code. The unit images corresponding to the diffused
code sequence are tiled as in the first embodiment to generate the watermark
image. The unit images can be arranged in either the transverse direction or
the longitudinal direction. FIG. 26 shows an example in which the whole
watermark image is generated by diffusion with the PN code.
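The diffusion rules can be sketched as follows, using the 15-bit sequence quoted earlier as the PN code. The function name spread and the sample information bits are hypothetical.

```python
def spread(info_bits, pn_code):
    """Diffuse information bits with a PN code."""
    chips = []
    for bit in info_bits:
        if bit == 0:
            chips.extend(pn_code)                    # bit "0": PN code as it is
        else:
            chips.extend(1 - c for c in pn_code)     # bit "1": PN code inverted
    return chips                                     # codes concatenated in order

pn = [int(c) for c in "000100110101111"]
chips = spread([0, 1, 1], pn)
print(len(chips))   # 45, that is N x 15 with N = 3
```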

When each unit image represents 1 bit, the watermark image is the
arrangement of unit images for the sequence diffused with the PN code. If a
unit image represents 2 bits or more, the diffused sequence can either be
embedded in the depth direction of the bits (the depth direction when each
bit is regarded as a bit plane) or be embedded only in the plane of a
specific bit depth.

The watermark information detecting apparatus 300a is described below,
focusing on the differences from the first embodiment.

The operation of the input device 301 in the second embodiment is the same
as in the first embodiment. The watermark detector 302a calculates the
correlation value between the signal output from the input device 301 and
the PN sequence used in the embedding process.

FIG. 27 is a schematic view showing the process of the watermark detector.

The correlation value is calculated following the order in which the
embedded PN sequence is arranged, for example while shifting one bit at a
time. If the embedded PN sequence is arranged transversely, the shift is made
in the transverse direction; if it is arranged longitudinally, the shift is
made in the longitudinal direction.
When the phase shift is zero, the correlation value of the longest code
sequence is "1"; otherwise it is "-1/L". Therefore, among the filter output
values of the watermark detector 302a, the portions where the watermark is
actually embedded give high correlation values. Since the correlation values
fluctuate because of noise, only high correlation values are picked out with
a threshold so as to determine the position of the embedded watermark. Even
if the embedded sequence is not a longest code sequence, the maximum
correlation value appears where the phases coincide, so detection at the
in-phase position is still possible. The correlation value is obtained from
[Formula 7] above and can also be expressed as follows:

(correlation value) = | (number of bits that agree with the PN code) -
(number of bits that disagree with the PN code) | ÷ (length of the PN code)

As this expression shows, the maximum correlation value does not always
reach "1". The same PN code sequences are arranged in synchronization row by
row. The correlation of the PN code sequence is shown in FIG. 27 (2).

The value of each diffused information bit is obtained from rules (1) to (3)
of the watermark information embedding apparatus 100a and the following
conditions:

(number of bits that agree with the PN code) ≥ (number of bits that
disagree with the PN code) → 0

(number of bits that agree with the PN code) < (number of bits that
disagree with the PN code) → 1

A run of code sequences whose correlation values with the PN code sequence
are higher than the threshold is detected as the watermark image area.
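The correlation expression and the two decision conditions can be combined in a short sketch. It assumes that the detected symbols are already aligned with the PN code period; the function name despread, the threshold value 0.5 and the test data are hypothetical.

```python
def despread(chips, pn_code, threshold=0.5):
    """Return one decision per PN period: 0, 1, or None if the correlation is too low."""
    L = len(pn_code)
    decisions = []
    for start in range(0, len(chips) - L + 1, L):
        block = chips[start:start + L]
        agree = sum(1 for c, p in zip(block, pn_code) if c == p)
        disagree = L - agree
        correlation = abs(agree - disagree) / L
        if correlation < threshold:
            decisions.append(None)      # not regarded as part of the watermark area
        else:
            decisions.append(0 if agree >= disagree else 1)
    return decisions

pn = [int(c) for c in "000100110101111"]
inverted = [1 - c for c in pn]
received = pn + inverted + pn
received[16] ^= 1                       # two detection errors in the second period
received[20] ^= 1
noise = [i % 2 for i in range(15)]      # a block that is unrelated to the PN code
print(despread(received + noise, pn))   # [0, 1, 0, None]
```

Two wrong symbols out of 15 lower the correlation of the second period from 1 to 11/15, which is still well above the threshold, so the information bit is recovered.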

The correlation peaks with the PN code normally occur at intervals equal to
the period of the PN code. If the interval is longer than the PN code
period, for example because the scanned paper has been stretched, a gap has
arisen in the signal and improper information may be extracted. In this
case, in this embodiment, the data in the excess interval is deleted to
overcome the problem. If the interval is shorter than the PN code period,
for example because information has been lost during detection due to folds
in the paper, dummy data (for example, all bits "0") is inserted to make up
for the lost information. Deleting information and inserting dummy data
increase the probability of errors. Therefore, in order to establish signal
synchronization over the entire containing watermark document 200a, the
error correction code and the data code are synchronized, thereby improving
the probability of correct detection.

Moreover, the deletion or insertion of information is carried out at the
largest interval found in the signal detection by the filters (step S304
above), so as to reduce the probability of errors.

The above describes diffusion of the entire watermark image with the PN
code. However, the diffusion can also be applied only around the watermark
area, or only to a partial area within the outline of the watermark image as
in the first embodiment. In this case, by establishing rules in advance for
the arrangement of the information diffused with the PN code, the watermark
image can be extracted and improper information caused by folding,
stretching, etc. can be corrected.

As discussed above, the second embodiment discloses a method for using
the PN code to diffuse the watermark information so as to generate the
watermark image. The second embodiment provides all the effects of the first
embodiment. Furthermore, in the second embodiment the precision of
detecting the watermark information is increased. Moreover, even if
information is lost because of folding or stretching, the second embodiment
can apply corresponding processing, such as the use of an error correction
code, to increase the probability of correct extraction.
[Third Embodiment] 

A third embodiment of the present invention provides examples of using
multiple PN codes, switched row by row or column by column.

The configuration of the third embodiment is the same as that of the second
embodiment. Referring to FIG. 22 again, the PN code generator 107 in the
third embodiment can generate multiple PN code sequences. The watermark
image generator 102a in the third embodiment uses the multiple PN code
sequences generated by the PN code generator 107 to perform the diffusion
process for each row or each column, thereby generating the watermark image.
The other elements of the third embodiment are the same as those in the
second embodiment, and their detailed description is omitted here.

The following describes the case in which multiple PN code sequences (two
different PN code sequences in this embodiment) are used to perform the
diffusion process row by row.

First, the processing of the watermark information embedding apparatus 100a
is described.

The PN code generator 107 generates N different kinds of PN code sequences
(two kinds in this example: the PN Code A sequence and the PN Code B
sequence). The watermark image generator 102a generates a watermark image by
diffusing each row with one of the PN code sequences. For example, when
there are N kinds of PN code sequences numbered "0" to "N-1", PN code "0" is
used for the first row, PN code "1" for the second row, and so on up to PN
code "N-1" for the Nth row, after which PN code "0" is used again for the
(N+1)th row. The PN code sequence used for diffusion is switched according
to this rule. The other processing in the third embodiment is the same as in
the second embodiment.

Next, the processing of the watermark information detecting apparatus 300a
is described.

FIG. 28 is a schematic view showing the processing of the third embodiment
(part one).

As shown in FIG. 28 (1), the correlation values between the output values of
the filters of the watermark detector 302a (the values extracted for each
unit image; see FIG. 17) and each of the N kinds of PN code sequences used
in the embedding process are calculated. In this example, N = 2.

If longest code sequences or Gold code sequences are used as the PN code
sequences, their correlation with other code sequences is low. The
correlation value is high only when the sequence is the same as the one
used for embedding and is in synchronization with it.

The kind of PN code that gives the peak correlation value can therefore be
taken as the address of the row. For example, the rows showing high
correlation with the PN Code A sequence are the first row and the (2n+1)th
rows (n is a natural number equal to 1 or more), and the rows showing high
correlation with the PN Code B sequence are the second row and the (2n+2)th
rows (n is a natural number equal to 1 or more). The kind number (0 to N-1)
of the PN code can thus be regarded as the row address of the embedded
information. If the detected row address differs from the address expected
from the row position, the difference can be corrected by deleting
information, inserting dummy data or other methods, so as to recover the
original address.
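The use of the PN code kind as a row address can be sketched as follows. The sketch assumes that each row holds exactly one in-phase period of its PN code. CODE_A is the 15-bit sequence quoted earlier; CODE_B, the function names and the example rows are hypothetical choices made for illustration.

```python
CODE_A = [int(c) for c in "000100110101111"]   # 15-bit sequence quoted in the text
CODE_B = [int(c) for c in "000111101011001"]   # hypothetical second 15-bit code

def correlation(chips, pn_code):
    """|agreements - disagreements| / length, as in the correlation expression."""
    agree = sum(1 for c, p in zip(chips, pn_code) if c == p)
    return abs(2 * agree - len(pn_code)) / len(pn_code)

def row_code_index(row_chips, codes):
    """Index of the PN code kind that correlates best with this row."""
    scores = [correlation(row_chips, code) for code in codes]
    return scores.index(max(scores))

def find_address_breaks(detected, n_codes):
    """Row positions where the cyclic order 0, 1, ..., N-1, 0, ... is broken."""
    return [i for i in range(1, len(detected))
            if detected[i] != (detected[i - 1] + 1) % n_codes]

inv = lambda code: [1 - c for c in code]
# Rows were embedded as A, B, A, B, A, B; the fourth row was lost (e.g. by a fold).
rows = [CODE_A, inv(CODE_B), inv(CODE_A), CODE_A, CODE_B]
detected = [row_code_index(r, [CODE_A, CODE_B]) for r in rows]
print(detected)                            # [0, 1, 0, 0, 1]
print(find_address_breaks(detected, 2))    # [3]
```

The break reported at position 3 marks where a row is missing, which is where dummy data would be inserted to restore the original addressing. Because the correlation uses an absolute value, the code kind is identified regardless of whether the row carries information bit "0" or "1".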

If the dummy data are inserted in the case of deletion information, 
detection error odds here are increasing. In order to obtain all signals' 
10 synchronization, synchronize the error correction codes and data codes so as 
to increase proper detection odds. Furthermore, methods of deletion 
information and insertion dummy data are carried through in the maximum 
interval of the signal detection so as to decrease error odds. 

Following is an illustration of using the multiple PN code sequences to 
1 5 perform the diffusion process with respect to column unit. 

Firstly, processes of the watermark information embedding apparatus 
100a are given. 

The PN code generator 107 generates N kinds of different PN code
sequences (in this example, there are two kinds: the PN Code A sequence and
the PN Code B sequence). The watermark image generator 102a generates a
watermark image by using the PN code sequences to diffuse the watermark
information for each column unit. When there are N kinds of PN code
sequences, the PN code of the first column is "0", the PN code of the
second column is "1", ..., the PN code of the Nth column is "N-1", and the
PN code of the (N+1)th column is "0" again. The PN code sequences are
switched for diffusion according to this rule.
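
As a non-limiting illustration, the switching rule above may be sketched as
follows. The sketch assumes that one watermark bit is assigned to each
column and that diffusion is an exclusive OR of the bit with every chip of
the PN code; the function names and code values are hypothetical and are
introduced only for the example.

```python
def pn_index_for_column(column, n_kinds):
    """Columns are numbered from 1; the code number cycles 0, 1, ..., N-1, 0, ..."""
    return (column - 1) % n_kinds

def diffuse_columns(bits, pn_codes):
    """Spread one bit (0 or 1) per column with the PN code assigned to that column."""
    n_kinds = len(pn_codes)
    spread = []
    for column, bit in enumerate(bits, start=1):
        code = pn_codes[pn_index_for_column(column, n_kinds)]
        spread.append([bit ^ chip for chip in code])  # assumed XOR diffusion
    return spread

# Hypothetical example with N = 2 codes of length 7.
code_a = [1, 1, 1, 0, 0, 1, 0]
code_b = [1, 0, 1, 1, 0, 0, 0]
columns = diffuse_columns([0, 1, 1], [code_a, code_b])
for column in columns:
    print(column)
```

The first and third columns use PN Code A and the second column uses PN
Code B; a bit of 1 inverts the code and a bit of 0 leaves it unchanged.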

The processes of the watermark information detecting apparatus 300a are
described below.

FIG. 29 is a schematic view showing processes of the third embodiment
(part two).

As shown in FIG. 29 (1), as with the row unit, the total correlation
value is calculated between the output values of the filter of the
watermark detector 302a (the values extracted for each unit image) and each
of the N kinds of PN codes used in the embedding process. In this example,
it is supposed that N = 2.

The kind of PN code that yields the peak correlation value is
substantially presumed to be the address of the column unit. For example,
with N = 2, the columns having the greater correlation with the PN Code A
sequence are the first column and the (2n+1)th columns (n is a natural
number equal to 1 or more). The columns having the greater correlation with
the PN Code B sequence are the second column and the (2n+2)th columns (n is
a natural number equal to 1 or more). Therefore, the number of the PN code
sequence kind (0 to N-1) can be regarded as the column address of the
embedded information. If there is a difference between the actual address
and the column address, methods such as deleting information or inserting
dummy data can be used to correct the difference so as to obtain the proper
address, which is the same as the original address.
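
One possible way of correcting such a difference by inserting dummy data is
sketched below, purely as an illustration. It assumes that the detector
outputs, for each detected column, the presumed PN code number together
with the extracted value, and that fewer than N consecutive columns are
lost. The function `realign` and the sample data are hypothetical; the
deletion of surplus information is not covered by this sketch.

```python
def realign(detected, n_kinds, dummy=None):
    """Insert dummy entries so that the k-th entry carries code number k mod N.

    detected: list of (code_number, value) pairs in scan order.
    """
    aligned = []
    for code_number, value in detected:
        # Number of units missing before this one, modulo N.
        gap = (code_number - len(aligned)) % n_kinds
        aligned.extend([dummy] * gap)
        aligned.append(value)
    return aligned

# Hypothetical example: N = 3, and the third column (code 2) was lost.
detected = [(0, "d0"), (1, "d1"), (0, "d3"), (1, "d4")]
aligned = realign(detected, 3)
print(aligned)
```

The dummy entry marks a position whose value is unknown, which is why
synchronizing the error correction codes with the data codes is useful.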

If dummy data are inserted where information has been deleted, the
probability of a detection error at that position increases. In order to
synchronize all the signals, the error correction codes and the data codes
are synchronized so as to increase the probability of correct detection.
Furthermore, the deletion of information and the insertion of dummy data
are carried out at the maximum interval of the signal detection so as to
decrease the probability of error.

As discussed above, the third embodiment discloses a method of using
the PN codes to diffuse the watermark information for each row unit or
column unit so as to generate the watermark image. The third embodiment
provides all the effects of the second embodiment. Furthermore, in the
third embodiment, the absolute addresses of the signals are embedded, so
errors caused by folding or stretching can be corrected. Therefore, even if
the information loses synchronization in the transverse and longitudinal
directions because of folding or stretching, the third embodiment can
resynchronize the information, thereby increasing the probability of
correct extraction.

[Fourth Embodiment] 

A fourth embodiment of the present invention is an example of a
two-dimensional PN code.

The configuration of the fourth embodiment is the same as that of the
second embodiment. Referring to FIG. 22 again, the PN code generator 107 in
the fourth embodiment can generate multiple PN code sequences. The
watermark image generator 102a in the fourth embodiment can utilize the
multiple PN code sequences generated by the PN code generator 107 to
perform the diffusion process, thereby generating the watermark image.
Other elements of the fourth embodiment are the same as those in the second
and third embodiments, so a detailed description is omitted here.

The processes of the watermark information embedding apparatus 100a are
described below. The PN code generator 107 generates two-dimensional codes.

FIG. 30 is a schematic view showing a two-dimensional PN code sequence.
As shown in FIG. 30, the PN code sequence generated in the horizontal
direction is the PN Code A sequence, and the PN code sequence generated in
the vertical direction is the PN Code B sequence. The bit values of the PN
Code A sequence and the PN Code B sequence are reversed. For example, the
bit value of the second PN Code A from the top is equal to the bit value of
the second PN Code B from the top. Each bit value is reversed. Each bit
value of the two-dimensional PN code representing "0" is the inverse of the
corresponding bit value of the two-dimensional PN code representing "1".

The watermark image generator 102a generates a watermark image through
the diffusion process by using the two-dimensional PN code. In this
embodiment, the watermark information is diffused by the two-dimensional
PN code. Other procedures in this embodiment are the same as those in the
second embodiment.

The processes of the watermark information detecting apparatus 300a are
described below.

FIG. 31 is a schematic view showing the detecting process of the
two-dimensional PN code in the fourth embodiment.

When the watermark detector 302a detects the two-dimensional PN code
sequence, the PN code sequences in the horizontal and vertical directions
are each used to calculate correlation values. As seen in FIG. 31 (1), when
the correlation value with the horizontal PN code sequence is calculated,
the peak correlation values appear continuously in the vertical direction.
As shown in FIG. 31 (2), when the correlation value with the vertical PN
code sequence is calculated, the peak correlation values appear
continuously in the horizontal direction. The points of intersection of
these continuous peak correlation values are the vertexes of the
two-dimensional PN code sequence. The address and range of the
two-dimensional PN code sequence can be calculated from the vertexes of the
two-dimensional PN code sequence.
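
The detection of a vertex from the two runs of peak correlation values may
be sketched as follows, purely as an illustration. The sketch assumes that
the two-dimensional PN code is the product of the horizontal code and the
vertical code in +1/-1 form, which the description does not state
explicitly, and that the area outside the code gives filter outputs of
zero. The function names, the code values, and the use of NumPy are
assumptions made for the example.

```python
import numpy as np

def make_2d_code(code_a, code_b):
    """Rows follow code_a (horizontal); each row is sign-flipped by code_b (vertical)."""
    return np.outer(code_b, code_a)            # shape (len(code_b), len(code_a))

def find_vertex(image, code_a, code_b):
    """Return (top, left) of the two-dimensional code inside a larger image."""
    h, w = image.shape
    la, lb = len(code_a), len(code_b)
    # Horizontal correlation: for every row, slide code_a along the columns.
    horiz = np.array([[abs(image[r, c:c + la] @ code_a) for c in range(w - la + 1)]
                      for r in range(h)])
    # Vertical correlation: for every column, slide code_b along the rows.
    vert = np.array([[abs(image[r:r + lb, c] @ code_b) for c in range(w)]
                     for r in range(h - lb + 1)])
    left = horiz.sum(axis=0).argmax()          # peaks line up vertically here
    top = vert.sum(axis=1).argmax()            # peaks line up horizontally here
    return int(top), int(left)

# Hypothetical example: a 7 x 7 code placed at row 3, column 2 of a 12 x 12 image.
code_a = np.array([1, 1, 1, -1, -1, 1, -1])
code_b = np.array([1, -1, 1, 1, -1, -1, -1])
image = np.zeros((12, 12))
image[3:10, 2:9] = make_2d_code(code_a, code_b)
vertex = find_vertex(image, code_a, code_b)
print(vertex)
```

The column at which the horizontal peaks accumulate and the row at which
the vertical peaks accumulate intersect at the upper-left vertex of the
code.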

FIG. 32 is an example of a two-dimensional PN code sequence.

As shown in FIG. 32, the two-dimensional PN code sequence is configured
with PN code 0, PN code 1, and PN code 2 in the horizontal and vertical
directions, respectively. For example, the two-dimensional PN code sequence
in FIG. 32 is configured with the bit array of PN code 1 in the horizontal
direction and the bit array of PN code 2 in the vertical direction. Each
two-dimensional PN code sequence thus has an inherent row address and
column address. As a result, the location of the two-dimensional PN code
can be detected reliably on the detection side.

As discussed above, in the fourth embodiment, the two-dimensional PN
code is used as the diffusion PN code to diffuse the watermark information.
The fourth embodiment provides all the effects of the second embodiment.
Furthermore, in the fourth embodiment, since a two-dimensional PN code has
a row address and a column address, the location of the two-dimensional PN
code can be determined. As a result, even if the watermark-containing
document 200a is folded or stretched, the watermark information can still
be extracted correctly.

[Fifth Embodiment]

A fifth embodiment of the present invention is an example of a
three-dimensional PN code.

The configuration of the fifth embodiment is the same as that of the
second embodiment. Referring to FIG. 22 again, the document data 105 in the
fifth embodiment is not a single page but multiple pages. The document
image generator 101 can generate a document image from such multipage
document data. The PN code generator 107 in the fifth embodiment generates
a three-dimensional PN code sequence for the multipage document data. The
watermark image generator 102a utilizes the three-dimensional code sequence
generated by the PN code generator 107 to diffuse the watermark information
106 so as to generate the watermark image. Other configurations are the
same as those in the second and fourth embodiments.

The processes of the watermark information embedding apparatus 100a are
described below.

The PN code generator 107 generates a three-dimensional code sequence.

FIG. 33 is an example of a three-dimensional PN code sequence. Referring
to FIG. 33, the three-dimensional PN code is configured by generating PN
Code C in the page direction on the basis of a two-dimensional PN code that
includes PN Code A in the horizontal direction and PN Code B in the
vertical direction. The two-dimensional PN code is identical to that in the
fourth embodiment, so a detailed description of the two-dimensional PN code
is omitted here.

The watermark image generator 102a utilizes the three-dimensional PN
code to diffuse the watermark information so as to generate the watermark
image. The two-dimensional PN code (including PN Code A and PN Code B) of
each plane of the three-dimensional PN code is embedded in the
corresponding page of the multipage document image.

The processes of the watermark information detecting apparatus 300a are
described below.

When the watermark detector 302a detects the watermark information from
the watermark-containing document 200a having multipage document data, the
watermark detector 302a extracts the two-dimensional PN code from each
page. The process of extracting the two-dimensional PN code is the same as
that in the fourth embodiment, so a detailed description is omitted here.
The correlation is then calculated between the value of each plane of the
two-dimensional code and the multiple PN codes in the page direction (the
PN Code C sequence). If the correlation is low, this indicates that a page
has been deleted or inserted. Other processes of the fifth embodiment are
the same as those in the second and fourth embodiments.
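
The page-direction check described above may be sketched as follows, as a
non-limiting illustration. The sketch assumes that the plane of each page
is the two-dimensional code multiplied by the corresponding chip of PN Code
C in +1/-1 form, and that the "plane value" of a page is its normalized
correlation with the two-dimensional code. These assumptions, the function
names, and the code values are introduced only for the example.

```python
import numpy as np

def page_values(pages, code_2d):
    """Normalized correlation of every page with the 2D code (near +1 or -1)."""
    return np.array([np.sum(p * code_2d) / code_2d.size for p in pages])

def page_correlation(pages, code_2d, code_c):
    """Correlate the per-page plane values with PN Code C in the page direction."""
    values = page_values(pages, code_2d)
    n = min(len(values), len(code_c))
    return float(values[:n] @ code_c[:n]) / len(code_c)

# Hypothetical codes of length 7.
code_a = np.array([1, 1, 1, -1, -1, 1, -1])
code_b = np.array([1, -1, 1, 1, -1, -1, -1])
code_c = np.array([1, 1, 1, -1, -1, 1, -1])
code_2d = np.outer(code_b, code_a)
pages = [c * code_2d for c in code_c]                   # seven intact pages

intact = page_correlation(pages, code_2d, code_c)
damaged = page_correlation(pages[:2] + pages[3:], code_2d, code_c)  # third page removed
print(intact, damaged)
```

With all pages present the correlation is at its maximum, whereas removing
a page shifts the following pages against PN Code C and the correlation
drops, which is the condition used to infer deletion or insertion of a
page.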

In the fifth embodiment, the two-dimensional PN code is expanded to
form the three-dimensional PN code. Thus, the fifth embodiment provides all
the effects of the second embodiment. Furthermore, in the fifth embodiment,
even if the watermark-containing document 200a consists of multipage
document data, the watermark information can still be detected correctly.

As discussed above, because the watermark image area is surrounded by a
dot pattern with a special value, the watermark image area can be detected
correctly. As a result, the watermark information can be detected
correctly. Moreover, the present invention utilizes the PN code sequence to
diffuse the watermark information, so the watermark image area can be
detected correctly, and signal distortion caused by image distortion and
folding can be corrected. As a result, the detection of the watermark is
not easily affected by factors such as folding and stretching.

The following points regarding the present invention should be noted.

(a) In the first embodiment, the unit images surround the outline of the
watermark area. However, as long as the essential part of the outline, such
as two adjacent sides of a rectangle, makes the record area of the
watermark information explicit, the unit images need not completely
surround the outline of the watermark information area. In embodiments 1
through 5, the record area serving as the watermark information area is not
limited to a rectangle; other shapes, such as a circle, are also
acceptable. In the first embodiment, the value of the outline of the
watermark information record area is defined as a special value, so the
same effect is obtained for a record area of any shape. In the example of
the first embodiment, the value of the outline is "1". However, the value
"0" is also acceptable for the outline.

(b) In embodiments 2 through 5, as long as the embedding side and the
detecting side share a common standardized code sequence, a maximum length
sequence (M sequence), a Gold code sequence, or another random code
sequence can be used as the PN code sequence.
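
For reference, one common way of generating an M sequence is a linear
feedback shift register, sketched below. This is a general illustration of
the technique and is not taken from the embodiments; the function names,
the tap positions (corresponding to the primitive polynomials x^3 + x + 1
and x^5 + x^2 + 1), and the seed are choices made for the example.

```python
def m_sequence(taps, degree, seed=1):
    """Generate one period (2**degree - 1 bits) of a shift-register sequence.

    taps: register stages (numbered from 1) that are XORed to form the feedback bit.
    """
    state = [(seed >> i) & 1 for i in range(degree)]  # state[0] is stage 1
    out = []
    for _ in range(2 ** degree - 1):
        out.append(state[-1])                         # output the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]               # shift by one stage
    return out

def periodic_autocorrelation(bits, shift):
    """Autocorrelation over one period, with bits mapped to +1/-1 chips."""
    chips = [1 - 2 * b for b in bits]
    n = len(chips)
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))

short = m_sequence((3, 2), 3)    # period 7
seq = m_sequence((5, 3), 5)      # period 31
print(short)
print(periodic_autocorrelation(seq, 0), periodic_autocorrelation(seq, 1))
```

The sharp autocorrelation peak at zero shift, against a uniformly small
value at every other shift, is the property that allows the detecting side
to find synchronization by correlation.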

(c) In the fifth embodiment, the three-dimensional PN code sequence is
used as the embedding/detecting PN code sequence. Accordingly, detection is
possible not only when the embedding target is a static image but also when
it is three-dimensional data, such as a multi-frame moving image or a
three-dimensional object. If the embedding target is a multi-frame moving
image, the frame number and the address are detected. The loss or insertion
of a frame can also be detected when the embedding target is a moving
image.

(d) In embodiments 2 through 5, the information to be diffused by the PN
code may be not only the whole of the digital watermark area but also the
outline (outline border) of the digital watermark area. The unit images can
be arranged around the outline of the watermark area at intervals of fifty
pixels or at arbitrary regular intervals. However, the arrangement rule
must be identical on the embedding side and the detecting side.

(e) In embodiments 2 through 5, the watermark information is represented
by arranging a dot pattern around the watermark image on one surface.
However, the present invention is not limited to this method. Any method is
acceptable as long as the watermark information 106 is embedded in the
document.

(f) In embodiments 1 through 5, the watermark-containing document 200,
200a is a printed paper document. However, the watermark-containing
document can also be another medium, such as an image on a display.